Skip to content

Conversation

@PiJoules
Copy link
Contributor

No description provided.

@llvmbot llvmbot added clang Clang issues not falling into any other category clang:codegen IR generation bugs: mangling, exceptions, etc. labels Nov 11, 2024
@llvmbot
Copy link
Member

llvmbot commented Nov 11, 2024

@llvm/pr-subscribers-clang-codegen

@llvm/pr-subscribers-clang

Author: None (PiJoules)

Changes

Full diff: https://github.com/llvm/llvm-project/pull/115788.diff

2 Files Affected:

  • (modified) clang/lib/CodeGen/ItaniumCXXABI.cpp (+8)
  • (added) clang/test/CodeGenCXX/fuchsia-global-dtor.cpp (+25)
diff --git a/clang/lib/CodeGen/ItaniumCXXABI.cpp b/clang/lib/CodeGen/ItaniumCXXABI.cpp
index 9b3c2f1b2af677..7dda27e02e9ea1 100644
--- a/clang/lib/CodeGen/ItaniumCXXABI.cpp
+++ b/clang/lib/CodeGen/ItaniumCXXABI.cpp
@@ -518,6 +518,14 @@ class FuchsiaCXXABI final : public ItaniumCXXABI {
   explicit FuchsiaCXXABI(CodeGen::CodeGenModule &CGM)
       : ItaniumCXXABI(CGM) {}
 
+  // Rather than using the defaul [__cxa_]atexit, instead use llvm.global_dtors
+  // which will result in .fini_array being used.
+  void registerGlobalDtor(CodeGenFunction &CGF, const VarDecl &D,
+                          llvm::FunctionCallee dtor,
+                          llvm::Constant *addr) override {
+    return CGM.AddCXXDtorEntry(dtor, addr);
+  }
+
 private:
   bool constructorsAndDestructorsReturnThis() const override { return true; }
 };
diff --git a/clang/test/CodeGenCXX/fuchsia-global-dtor.cpp b/clang/test/CodeGenCXX/fuchsia-global-dtor.cpp
new file mode 100644
index 00000000000000..ff3066059af91d
--- /dev/null
+++ b/clang/test/CodeGenCXX/fuchsia-global-dtor.cpp
@@ -0,0 +1,25 @@
+/// Global destructors targetting Fuchsia should not use [__cxa_]atexit. Instead
+/// they should be invoked through llvm.global_dtors.
+
+// RUN: %clang_cc1 %s -triple aarch64-unknown-fuchsia -emit-llvm -o - | FileCheck %s
+
+// CHECK-NOT: atexit
+
+// CHECK:      @llvm.global_dtors = appending global [1 x { i32, ptr, ptr }]
+// CHECK-SAME:   [{ i32, ptr, ptr } { i32 {{.*}}, ptr [[MODULE_DTOR:@.*]], ptr {{.*}} }]
+
+// CHECK:      define internal void [[MODULE_DTOR]]() {{.*}}{
+// CHECK-NEXT: entry:
+// CHECK-NEXT:   %0 = call ptr @_ZN1AD1Ev(ptr @DestroyFirst)
+// CHECK-NEXT:   %1 = call ptr @_ZN1AD1Ev(ptr @DestroySecond)
+// CHECK-NEXT:   %2 = call ptr @_ZN1AD1Ev(ptr @DestroyThird)
+// CHECK-NEXT:   ret void
+// CHECK-NEXT: }
+
+struct A {
+  ~A() {}
+};
+
+A DestroyThird;
+A DestroySecond;
+A DestroyFirst;

Copy link
Collaborator

@efriedma-quic efriedma-quic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure we can just unconditionally call the destructor. We need to ensure the constructor runs first. If a constructor throws an exception or calls exit(), not all constructors will run before we start cleaning up. This works out naturally with atexit... but if you use global_dtors, I think you end up calling the destructor on an uninitialized object.

@efriedma-quic
Copy link
Collaborator

Also, I'm not sure objects are destroyed in the correct order.


As far as I know, fini_array existed at the time the Itanium ABI was written; I assume it intentionally was not used because there were issues ensuring the correct objects are destroyed in the correct order.

@PiJoules
Copy link
Contributor Author

PiJoules commented Nov 13, 2024

I'm not sure we can just unconditionally call the destructor. We need to ensure the constructor runs first. If a constructor throws an exception or calls exit(), not all constructors will run before we start cleaning up. This works out naturally with atexit... but if you use global_dtors, I think you end up calling the destructor on an uninitialized object.

Ah yeah you're right. I didn't consider this. Since we're willing to make ABI changes, my colleague @mysterymath came up with another idea that might help address this:

  1. Instrument in a dso_local global counter that indicates how many ctors we successfully called.
  2. After each successful ctor call for each global, increment the counter by 1.
  3. Inside the function that would be added to llvm.global_dtors and would invoke all the dtors, have a global switch that jumps to the appropriate case based on the number of correctly initialized globals. This would be akin to Duff's device.
// Given globals:
//
//   A first;
//   A second;
//   A third;

// Using C as pseudo-code instead of IR.
static int SuccessfulInits = 0;

void module_ctor() {
  construct_A(&first);
  SuccessfulInits++;

  construct_A(&second);
  SuccessfulInits++;

  construct_A(&third);
  SuccessfulInits++;
}

void module_dtor() {
  switch (SuccessfulInits) {
    case 3: destruct_A(&third);  // fallthrough
    case 2: destruct_A(&second);  // falthrough
    case 1: destruct_A(&first);  // fallthrough
    case 0: break;  // No destructors invoked
  }
}

An alternative approach might be using a lookup table with a decrementing pointer:

void __dtor_first() {
  destruct_A(&first);
}

void __dtor_second() {
  destruct_A(&second);
}


void __dtor_third() {
  destruct_A(&third);
}

void (*dtor_list[])() = {__dtor_first, __dtor_second, __dtor_third};
void (*dtor_ptr)() = dtor_list;

void module_ctor() {
  construct_A(&first);
  dtor_ptr++;

  construct_A(&second);
  dtor_ptr++;

  construct_A(&third);
  dtor_ptr++;
}

void module_dtor() {
  while (dtor_ptr != dtor_list) {
    (--dtor_ptr)();  // Invoke one of the functions in dtor_list
  }
}

With either of these, I think we guarantee that we only call the dtors of globals that were successfully constructed even if a ctor calls exit or throws an exception.

Also, I'm not sure objects are destroyed in the correct order.

Do you mean for the test case I added? I think they should be destroyed in reverse order.

@efriedma-quic
Copy link
Collaborator

efriedma-quic commented Nov 13, 2024

To ensure we don't destroy objects that weren't constructed, you can use a guard variable of some sort, sure. I'm not quite sure what the goal is, here; are you trying to improve performance, or are you worried about the function pointers for security?

No, I mean for objects in different translation units. I think the same translation unit works like you think. And different translation units maybe works for non-inline variables, assuming the loader iterates in the right direction?

I think you might run into issues with inline variables? Not sure if this patch actually changes behavior for them.

@PiJoules
Copy link
Contributor Author

(Sorry I think I accidentally edited your last comment rather than adding a new one)

To ensure we don't destroy objects that weren't constructed, you can use a guard variable of some sort, sure. I'm not quite sure what the goal is, here; are you trying to improve performance, or are you worried about the function pointers for security?

Fewer atexit calls means fewer calls to malloc since the number of callbacks can dynamically grow.

No, I mean for objects in different translation units. I think the same translation unit works like you think. And different translation units maybe works for non-inline variables, assuming the loader iterates in the right direction?

I think you might run into issues with inline variables? Not sure if this patch actually changes behavior for them.

Good points, I'll have to write more tests for this

@mysterymath
Copy link
Contributor

mysterymath commented Nov 13, 2024

To ensure we don't destroy objects that weren't constructed, you can use a guard variable of some sort, sure. I'm not quite sure what the goal is, here; are you trying to improve performance, or are you worried about the function pointers for security?

The scenario that brought this up was constinit globals still generating constructors that just call __cxa_atexit; this behavior was regarded as prima facie "odd", since it invokes quite a lot of mechanism just to register statically-known destructors.

With an approach like this, such constructors wouldn't need to be generated; the little state machine in the module constructor function would take it as granted that such "constructors" executed successfully. We'd probably need some additional mechanism in Clang's data structures to keep track of the relationship between these trivial constructors and their destructors though.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

clang:codegen IR generation bugs: mangling, exceptions, etc. clang Clang issues not falling into any other category

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants